{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8f0e85be-7fb0-4027-b1cc-b80aa7c8912c",
   "metadata": {},
   "source": [
    "### Exporting WeChat chat history"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adfa7307-2681-4b57-9baf-94784dd47f46",
   "metadata": {},
   "source": [
    "- https://github.com/LC044/WeChatMsg\n",
    "    - It can even export chats as JSON in a format suitable for GLM training."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4af4610d-dfbf-4cad-a553-aea251628a20",
   "metadata": {},
   "source": [
    "### misc"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4621f7d2-179f-417b-aecb-f1c1de1b0697",
   "metadata": {},
   "source": [
    "- markitdown\n",
    "    - https://github.com/microsoft/markitdown\n",
    "    - Paper PDF => Markdown\n",
    "        - LaTeX/formula conversion quality is poor.\n",
    "    - Lightweight and simple, but limited in capability.\n",
    "- **MinerU**: quite impressive.\n",
    "    - https://github.com/opendatalab/MinerU\n",
    "    - Environment setup\n",
    "    - Model weights download\n",
    "        - https://github.com/opendatalab/MinerU/blob/master/docs/how_to_download_models_en.md\n",
    "        - Config file: `~/magic-pdf.json`\n",
    "    - Command-line usage\n",
    "        - https://mineru.readthedocs.io/en/latest/user_guide/quick_start/command_line.html"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
